Rank | Count | Beginning |
---|---|---|
8 | 2237 | W |
23 | 760 | Na |
28 | 557 | Nie |
16 | 507 | To |
17 | 369 | Jak |
62 | 361 | Ale |
201 | 353 | A |
331 | 343 | Po |
173 | 342 | Z |
11 | 305 | Według |
74 | 288 | Do |
3 | 242 | I |
18 | 212 | Od |
181 | 193 | O |
118 | 168 | Czy |
37 | 167 | Jeśli |
212 | 158 | Jednak |
44 | 157 | Co |
5 | 145 | Teraz |
673 | 144 | Jest |
580 | 134 | Zdaniem |
188 | 128 | Za |
79 | 122 | Jego |
328 | 104 | Dlatego |
63 | 100 | Gdy |
66 | 94 | Dzięki |
368 | 90 | Tak |
461 | 86 | Dla |
162 | 85 | Podczas |
939 | 83 | Kiedy |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
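The query above can be tried out without a MySQL server. The following is a minimal sketch, assuming an in-memory SQLite database and a small made-up list of sentences. SQLite has no built-in `substring_index`, so the sketch registers a Python stand-in that only covers positive counts. The helper name `top_beginnings`, the sample sentences, and the extra `beg` tie-breaker in `ORDER BY` (added only to make the output deterministic) are assumptions for illustration and not part of the original setup.

```python
import sqlite3


def substring_index(text, delim, count):
    """Stand-in for MySQL's SUBSTRING_INDEX, positive counts only:
    return everything before the count-th occurrence of delim."""
    return delim.join(text.split(delim)[:count])


def top_beginnings(sentences, n_words=1, limit=50):
    """Return (beginning, count) pairs for the first n_words words
    of each sentence, most frequent first."""
    conn = sqlite3.connect(":memory:")
    # Make the Python function callable from SQL under the MySQL name.
    conn.create_function("substring_index", 3, substring_index)
    conn.execute("CREATE TABLE sentences (sentence TEXT)")
    conn.executemany(
        "INSERT INTO sentences (sentence) VALUES (?)",
        [(s,) for s in sentences],
    )
    rows = conn.execute(
        "SELECT substring_index(sentence, ' ', ?) AS beg, COUNT(*) AS cnt "
        "FROM sentences GROUP BY beg "
        "ORDER BY cnt DESC, beg LIMIT ?",  # 'beg' tie-breaker is an addition
        (n_words, limit),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the corpus.
    sample = [
        "W domu jest kot.",
        "W lesie rosną drzewa.",
        "Na stole leży książka.",
        "W sobotę pada deszcz.",
    ]
    for beg, cnt in top_beginnings(sample, n_words=1):
        print(cnt, beg)
```

Passing `n_words=2`, `3` or `4` gives the corresponding counts for longer sentence beginnings, which mirrors changing the third argument of `substring_index` in the original query.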